Exploiting both local and global constraints for multi-span statistical language modeling

نویسنده

  • Jerome R. Bellegarda
چکیده

A new framework is proposed to integrate the various constraints, both local and global, that are present in the language. Local constraints are captured via ngram language modeling, while global constraints are taken into account through the use of latent semantic analysis. An integrative formulation is derived for the combination of these two paradigms, resulting in several families of multi-span language models for large vocabulary speech recognition. Because of the inherent complementarity in the two types of constraints, the performance of the integrated language models, as measured by perplexity, compares favorably with the corresponding n-gram performance.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Multi-Span statistical language modeling for large vocabulary speech recognition

The goal of multi-span language modeling is to integrate the various constraints, both local and global, that are present in the language. In this paper, local constraints are captured via the usual n-gram approach, while global constraints are taken into account through the use of latent semantic analysis. An integrative formulation is derived for the combination of these two paradigms, result...

متن کامل

OF THE IEEE , AUGUST 2000 1 Exploiting Latent Semantic Information in Statistical Language Modeling Jerome

| Statistical language models used in large vocabulary speech recognition must properly encapsulate the various constraints, both local and global, present in the language. While local constraints are readily captured through n-gram modeling, global constraints, such as long-term semantic dependencies, have been more diÆcult to handle within a data-driven formalism. This paper focuses on the us...

متن کامل

Context scope selection in multi-Span statistical language modeling

A multi-span framework was recently proposed to integrate the various constraints, both local and global, that are present in the language. In this approach, local constraints are captured via n-gram language modeling, while global constraints are taken into account through the use of latent semantic analysis. The complementarity between these two paradigms translates into improved modeling per...

متن کامل

Speech recognition experiments using multi-span statistical language models

A multi-span framework was recently proposed to integrate the various constraints, both local and global, that are present in the language. In this approach, local constraints are captured via n-gram language modeling, while global constraints are taken into account through the use of latent semantic analysis. The performance of the resulting multispan language models, as measured by perplexity...

متن کامل

Large vocabulary speech recognition with multispan statistical language models

Multispan language modeling refers to the integration of the various constraints, both local and global, present in the language. It was recently proposed to capture global constraints through the use of latent semantic analysis, while taking local constraints into account via the usual n-gram approach. This has led to several families of data-driven, multispan language models for large vocabul...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 1998